Method #1. Tree-based classification

Step 1 : Collecting the data

Step 2 : Exploring the data

Step 3 : Training a model on the data

Step 4 : Evaluating model performance

Q1 : If you see an accuracy of 100%, what does it mean? Does this mean that we have designed a perfect model? This is something that needs more discussion. Write a few sentences about an accuracy of 100%.

Answer : An accuracy score of 100% doesn't mean that the model is perfect. It's probably too good to be true, and the model might be overfitted. On real-world data, 100% accuracy is rarely believable.
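One way to check the overfitting suspicion is to compare accuracy on the training data with accuracy on held-out data. The sketch below is only an illustration, not the lab's code: it uses a hypothetical 1-nearest-neighbour "memorising" model as a stand-in for a fully grown tree, and made-up data whose labels are pure noise. The names `nearest_label` and `accuracy` are assumptions introduced here.

```python
import random


def nearest_label(train_X, train_y, x):
    """Return the label of the training point closest to x (1-nearest neighbour)."""
    best = min(range(len(train_X)), key=lambda i: abs(train_X[i] - x))
    return train_y[best]


def accuracy(train_X, train_y, X, y):
    """Fraction of points in (X, y) that the memorising model labels correctly."""
    hits = sum(nearest_label(train_X, train_y, x) == label for x, label in zip(X, y))
    return hits / len(y)


random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical data: one random feature and labels that are pure noise,
# so there is nothing real to learn.
X = [random.random() for _ in range(400)]
y = [random.randint(0, 1) for _ in range(400)]

X_train, y_train = X[:200], y[:200]
X_test, y_test = X[200:], y[200:]

# The model has memorised the training points, so it scores perfectly on them.
train_acc = accuracy(X_train, y_train, X_train, y_train)
# On unseen points the memorised labels are of no help.
test_acc = accuracy(X_train, y_train, X_test, y_test)

print(f"training accuracy: {train_acc:.2%}")
print(f"test accuracy:     {test_acc:.2%}")
```

A perfect score on the training data together with a much lower score on the held-out data is the usual sign of overfitting, so a 100% score is only meaningful when it is measured on data the model has not seen.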

Method 2 : Random Forest

Q2 : What are the three most important features in this model?

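Answer : From the output below, they are "purpose", "creditability", and "account balance". However, when I run this command many times, it sometimes shows "age" as the third most important feature. This is probably because a random forest is built with random sampling, so the ranking of features with similar importance can change between runs unless the random seed is fixed.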

Method 3 : Adding regression to trees

Step 1 : Collecting the data

Step 2 : Exploring and preparing the data

Step 3 : Splitting the data and training the model

Step 4 : Evaluating model performance

Q3 : What is your interpretation of this RMSE value?

Answer : Since RMSE is an error value, we want it to be as low as possible. Therefore, this RMSE value indicates that the model is not quite ideal.
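An RMSE is easier to judge next to a reference point, because it is expressed in the same units as the target. The sketch below is a minimal illustration with made-up numbers, not the values from this lab: it computes RMSE and compares it with a baseline that always predicts the mean. The helper `rmse` and all data values are assumptions introduced here.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical target values and model predictions, for illustration only.
actual = [3.0, 5.0, 2.5, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]

# Baseline: always predict the mean of the actual values.
mean_value = sum(actual) / len(actual)
baseline = [mean_value] * len(actual)

model_rmse = rmse(actual, predicted)
baseline_rmse = rmse(actual, baseline)

print(f"model RMSE:    {model_rmse:.3f}")
print(f"baseline RMSE: {baseline_rmse:.3f}")
print(f"target range:  {max(actual) - min(actual):.3f}")
```

If the model's RMSE is not clearly below the baseline's, or is large compared with the range of the target, the model is adding little over simply guessing the average.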

Method #4. News Popularity

Step 1 : Collecting the data

Step 2 : Pre-processing

Decision Tree

YES!! Dropping "shares" as well brought the accuracy back to 58%! Thank you, Prof. Sadeghian! Before this fix, the Decision Tree accuracy score here said 100%, which meant something was wrong. When I had run this code earlier, I got 58.11%.
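A jump to 100% that disappears once "shares" is dropped looks like target leakage: the feature carries the answer. The sketch below illustrates the effect on made-up data, under the assumption that the class label is derived from "shares" with a cutoff. The cutoff value, the `noise` feature, and the helper `stump_accuracy` are all hypothetical and are not taken from the lab's code.

```python
import random


def stump_accuracy(values, labels):
    """Best accuracy of a one-feature rule 'predict 1 when value >= threshold'."""
    best = 0.0
    for threshold in set(values):
        hits = sum((v >= threshold) == bool(label) for v, label in zip(values, labels))
        best = max(best, hits / len(labels))
    return best


random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical rows: "shares" plus one unrelated feature.
shares = [random.randint(1, 2800) for _ in range(300)]
noise = [random.random() for _ in range(300)]

# Assumption: the class label is derived from "shares" with a cutoff.
THRESHOLD = 1400  # hypothetical cutoff, for illustration only
popular = [int(s >= THRESHOLD) for s in shares]

# With "shares" available, a single split reproduces the label exactly.
acc_with_shares = stump_accuracy(shares, popular)
# Without it, the remaining feature says nothing about the label.
acc_without_shares = stump_accuracy(noise, popular)

print(f"accuracy using 'shares':    {acc_with_shares:.2%}")
print(f"accuracy without 'shares':  {acc_without_shares:.2%}")
```

When one column lets a single split score perfectly, it is worth checking whether the label was computed from that column before trusting the result.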

Random Forest

Dropping "shares" brought back the correct accuracy score here too! It's 62%.

Before that fix, I got 99.9% accuracy here, although an earlier run of Random Forest had given an accuracy score of 62%.

From above, the three most important features here are "shares", "kw_max_avg:Avg.keyword(max.shares)", and "LDA_02:Closeness to LDA topic 2". But when I ran this code earlier, it gave me "kw_avg_avg:Avg.keyword(avg.shares)" as the most important feature instead of "shares".